Data-driven quantum chemical property prediction leveraging 3D conformations with Uni-Mol+

Quantum chemical (QC) property prediction is crucial for computational materials and drug design, but relies on expensive electronic structure calculations like density functional theory (DFT). Recent deep learning methods accelerate this process using 1D SMILES or 2D graphs as inputs but struggle to achieve high accuracy as most QC properties depend on refined 3D molecular equilibrium conformations. We introduce Uni-Mol+, a deep learning approach that leverages 3D conformations for accurate QC property prediction. Uni-Mol+ first generates a raw 3D conformation using RDKit then iteratively refines it towards DFT equilibrium conformation using neural networks, which is finally used to predict the QC properties. To effectively learn this conformation update process, we introduce a two-track Transformer model backbone and a novel training approach. Our benchmarking results demonstrate that the proposed Uni-Mol+ significantly improves the accuracy of QC property prediction in various datasets.

head on top of its architecture to do supervised pre-training. Could the author offer more insights beyond such a combination? Generally speaking, the proposed method itself reveals few insights to the research community and thus lacks enough significance. The task is well-studied while the model architecture / training objectives lack novelty.

Suggested Complemental Experiments and Ablations:
Since the Uni-Mol+ requires RDKit to generate the initial 3D conformation as input which is different from previous baselines, could the author provide an ablation study demonstrating the significance of such a module? To be specific, how is the performance of Uni-Mol+ when the starting 3D conformation is noisy or not as good as RDKit? Is the model showing consistency or robustness for the evaluation results by switching different conformer sampling modules, including but not limited to the Openbabel mentioned in the paper?
Could the authors provide the benchmark results on the QM9 to make a fair comparison with the baseline models such as Transformer-M?
By comparing the Uni-Mol and Uni-Mol+ (this paper), the Uni-Mol involved the 3D position predictions task as the pre-training objectives while Uni-Mol+ merged it and the property prediction task into one single supervised training. Since one of the biggest merits of pre-training is to amortize the expensive training overhead and make finetuning accessible on downstream tasks, I am curious: what is the estimated training time for Uni-Mol+ for each task?
Could the author report the error bar or standard deviation for the evaluation metrics such as MAE?
How is the weight q encoded in the model? It would be good to see how the prediction performance changes along the pseudo trajectory with different q. The author can simply do an experiment by feeding the model with different noise interpolation q during inference.

Clarity of the Implementation details:
One of the core modules used in Uni-Mol+ is the RDKit which generates the initial 3D conformation of each input molecule. However, since RDKit is just a molecular toolkit, could the author please provide how they conduct such a generation pipeline in detail? For example, common practice using RDKit for conformer sampling involves the ETKDG [4] method to embed conformers in 3D space and apply MMFF94 for optimization. Is this the case for the Uni-Mol+? Plus, do the authors perform proper filtering for the RDKit-generated conformations? Please elaborate on this point.
One important component of the proposed Uni-Mol+ is the 3D position prediction head which enhances the supervision signals during training. However, no apparent description of the position prediction head can be found in the manuscript. Can the authors elaborate on this? Moreover, since the position is the molecular geometry in 3D space, how is the equivariance w.r.t. the spatial transformation considered in the architecture?
Reviewer #2 (Remarks to the Author):
The authors study a significant problem: predicting quantum properties from 2D molecule modality, which usually requires time-consuming DFT computation. Specifically, the authors propose the Uni-Mol+ framework, which models the precise molecular geometry starting from the coarse one given by RDKit. Experimental results and ablations show a significant performance improvement on multiple benchmarks and indicate the effectiveness of the proposed approach.
Below are some itemized concerns and suggestions:
1. The authors provide parameter comparison on multiple datasets, but the time cost is missing. In addition to the parameters, the number of iterations R is another variable that determines the overall computation cost. Since the goal of the studied problem is, in essence, to approximate DFT computation with affordable computation, it would be important to compare practical time cost among different approaches as well as DFT. In addition, could the authors study the effect of increasing R on molecules grouped by different sizes? Given the intuition that DFT computation cost grows significantly over molecule size, it's reasonable to assume that more computation is required for larger molecules, controlled by R in Uni-Mol+. Therefore it's important to make clear the scalability of Uni-Mol+.
2. The proposed training strategy is novel and effective. And I appreciate the authors' discussion on the difference between the proposed noisy interpolation and Noisy Node. I wonder if the authors can further perform ablations comparing the two approaches. It will add to the significance of this work since the noisy interpolation may be used as a fundamental technique in different tasks, modalities, and models.
3. The authors claim that their method refines initial conformations towards DFT equilibrium conformation. I suggest a quantitative evaluation of the optimized conformation's accuracy. For instance, comparing the RMSD between the refined conformation and the ground truth would be informative (beyond the selected samples' RMSD presented in Figure 2). Additionally, evaluating the performance of trained 3D GNN models, like SphereNet, in predicting QC properties from the optimized molecular geometries would be beneficial, considering the known accuracy of 3D GNN models in predicting QC properties from equilibrium conformations.
4. In the OGB leaderboard, a method named "EGT+Tri.Attn.+RDKitCoords." outperforms Uni-Mol+. I recommend that the authors include this method in Table 1 for a comprehensive comparison.
5. Figure 3 currently only shows the energy difference between initial and predicted conformers, which offers limited information. Providing the energy difference between initial and equilibrium conformers would more accurately reflect the model's performance in optimizing molecular geometries.

Could the authors provide further insights into why integrating explicit 3D geometry prediction within the neural network is superior to maximizing mutual information between 2D and 3D molecular views during training?
Below are some additional questions:
1. The use of RDKit for generating initial conformations introduces some randomness. How does this affect the accuracy of QC property predictions? Additionally, how robust is the proposed method in optimizing equilibrium conformations from various initial conformations?
2. When the iteration count (R) is 1, leading to the prediction of two conformers, is the L1 loss on structures calculated for each predicted conformer or only the final one? Furthermore, is the QC property loss assessed based solely on the final conformer?
Response to Reviewers' Comments on "Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+"

1 Response to Reviewer #1
We are grateful to the reviewer for the insightful feedback. Here is our detailed response to each of your comments.

Thank you for your suggestion. We've updated our manuscript to include references to AlphaFold2 in Sections 2 and 4.
We agree that at a conceptual level, Uni-Mol+ shares similarities with AlphaFold2's Evoformer, especially in how both use a two-track transformer approach (Node Representation and Pair Representation). However, the key difference lies in the type of input data each one uses. Evoformer is built to process protein sequences and multiple sequence alignments (MSAs), whereas Uni-Mol+ is designed for atoms and their 3D positions. This fundamental difference leads to several variations in their design.
For instance, Evoformer includes specific layers like MSARowAttention and MSAColumnAttention to handle MSAs, which are not needed in Uni-Mol+. In the Pair Representation, Uni-Mol+ uses a single TriangularUpdate for efficiency, while Evoformer uses four different Triangular operators.
Furthermore, Evoformer relies on a complex StructureModule to predict the protein structure from sequence representation, while Uni-Mol+ simply needs an SE(3) coordinate head. As a result, Uni-Mol+ is simpler and more efficient compared to Evoformer. We have elaborated on these differences in Supplementary Section 1.
Thank you for the valuable suggestion. We have conducted additional experiments to assess the robustness of our model with varying input conformations. Specifically, we introduced Gaussian noise (with standard deviations of 0.1 and 0.3) to the initial RDKit conformations. The results, as detailed in Table 2, demonstrate that our model's performance is relatively unaffected by changes in the initial conformations.
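The noise perturbation used in this robustness check can be sketched as follows. This is only an illustrative sketch: the function name `perturb_conformation` is hypothetical, and it assumes the noise is independent and zero-mean per coordinate, in the same length unit as the coordinates (the response does not state the unit).

```python
import numpy as np

def perturb_conformation(coords, sigma, seed=0):
    """Add i.i.d. zero-mean Gaussian noise with std `sigma` to every x, y, z value.

    `coords` is an (n_atoms, 3) array of initial coordinates (assumed layout).
    """
    rng = np.random.default_rng(seed)
    coords = np.asarray(coords, dtype=float)
    return coords + rng.normal(0.0, sigma, size=coords.shape)

# Hypothetical usage with the two noise levels mentioned in the response.
initial = np.zeros((4, 3))
noisy_inputs = {sigma: perturb_conformation(initial, sigma) for sigma in (0.1, 0.3)}
```

Each perturbed conformation would then replace the RDKit conformation as the model input, and the validation metric would be recomputed for each noise level.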

Comparison with Uni-Mol and Noisy Nodes
Furthermore, we conducted an experiment starting from 2D conformations (with a flat z-axis) generated by RDKit's AllChem.Compute2DCoords. Despite the significant challenge posed by the absence of 3D information, the result is only a minor drop in performance, and the model still largely outperforms previous baselines. This finding underscores the robustness of Uni-Mol+: it maintains high performance levels even without 3D conformation inputs. We have elaborated on these discussions in Supplementary Section 3.
For QM9, the task involves directly learning quantum chemistry (QC) properties based on DFT equilibrium conformation. However, Uni-Mol+ is designed to focus on tasks that do not require knowledge of DFT conformation during inference. As a result, direct comparisons and results on the QM9 dataset may not be appropriate for evaluating Uni-Mol+'s performance.
As noted in Supplementary Section 2, training for the PCQM4MV2 task required approximately 5 days on 8 NVIDIA A100 GPUs, while the OC20 task took around 7 days on 16 NVIDIA A100 GPUs.

QM9 experiments
We acknowledge the benefits of pretraining, particularly in improving data efficiency for downstream tasks and potentially reducing training costs. Pretraining is especially beneficial for enhancing performance in tasks with limited labeled data, such as ADME/T tasks in molecular representation learning. Yet, for data-rich tasks such as PCQM4MV2 and OC20, the immediate need for pretraining to enhance performance is reduced. Crucially, pretraining is not only compatible with Uni-Mol+ but also complementary. In future work, we believe that incorporating pretraining could further improve Uni-Mol+'s data efficiency.

Error bars
▷ Review Comment: Could the author report the error bar or standard deviation for the evaluation metrics such as MAE?
We appreciate your suggestion and have accordingly trained additional Uni-Mol+ models using two distinct seeds on both the PCQM4MV2 and OC20 benchmarks. We've computed the mean and standard deviation of the inference results and updated the experimental tables (Table 1 and Table 2 in our manuscript). The minimal variation in the results confirms Uni-Mol+'s robust performance, consistently outperforming previous models despite accounting for randomness.
It's important to note, however, that due to leaderboard submission constraints, we're unable to provide error bars for the test sets. Moreover, previous baseline publications did not report error bars, and replicating their models to calculate these would be challenging. We are grateful for your insightful suggestions and have endeavored to provide the most comprehensive and reliable data within these constraints.

Strategy of sampling q
▷ Review Comment: How is the weight q encoded in the model? It would be good to see how the prediction performance changes along the pseudo trajectory with different q. The author can simply do an experiment by feeding the model with different noise interpolation q during inference.
As detailed in Section 4.2, during training, the sampling strategy of q uses a mixture of a Bernoulli distribution and a Uniform distribution. During inference, since the DFT equilibrium conformations are unknown, q is fixed to 1.0.
We value your input on evaluating inference performance across pseudo trajectories with varying q values. However, it's crucial to mention that the available validation and test sets in the experimental datasets do not provide DFT equilibrium conformations, rendering the proposed experiment infeasible with the current data. Nevertheless, given that quantum chemistry (QC) properties are calculated from DFT equilibrium conformations, it's reasonable to assume that smaller q values (which are closer to equilibrium) would yield better results.
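A minimal sketch of this kind of q sampling and interpolation is given here, under the convention implied by this response that q = 1.0 corresponds to the raw initial conformation and smaller q lies closer to equilibrium. The function names, the mixture weights `p_uniform` and `p_raw`, and the Uniform range are hypothetical placeholders; the actual settings are those of Section 4.2 of the manuscript and are not reproduced in this response. The sketch also omits any additional coordinate noise.

```python
import numpy as np

def sample_q(rng, p_uniform=0.5, p_raw=0.5):
    """Draw q from a mixture of a Uniform and a Bernoulli distribution.

    p_uniform and p_raw are illustrative assumptions, not the paper's values.
    """
    if rng.random() < p_uniform:
        return float(rng.uniform(0.0, 1.0))       # intermediate pseudo-trajectory state
    return 1.0 if rng.random() < p_raw else 0.0   # Bernoulli: one of the two endpoints

def interpolate(c_init, c_eq, q):
    """Linear interpolation: q = 1 gives the raw input, q = 0 the equilibrium conformation."""
    return q * np.asarray(c_init, dtype=float) + (1.0 - q) * np.asarray(c_eq, dtype=float)

rng = np.random.default_rng(0)
c_init = rng.normal(size=(4, 3))   # stand-in for an RDKit conformation
c_eq = rng.normal(size=(4, 3))     # stand-in for a DFT equilibrium conformation
train_input = interpolate(c_init, c_eq, sample_q(rng))
infer_input = interpolate(c_init, c_eq, 1.0)      # inference: q fixed to 1.0
```

At inference the equilibrium conformation is unknown, which is why only the q = 1.0 endpoint can be formed there.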

Clarity of the Implementation details
▷ Review Comment: Clarity of the Implementation details: One of the core modules used in Uni-Mol+ is the RDKit which generates the initial 3D conformation of each input molecule. However, since RDKit is just a molecular toolkit, could the author please provide how they conduct such a generation pipeline in detail? For example, common practice using RDKit for conformer sampling involves the ETKDG [4] method to embed conformers in 3D space and apply MMFF94 for optimization. Is this the case for the Uni-Mol+? Plus, do the authors perform proper filtering for the RDKit-generated conformations? Please elaborate on this point.
One important component of the proposed Uni-Mol+ is the 3D position prediction head which enhances the supervision signals during training. However, no apparent description of the position prediction head can be found in the manuscript. Can the authors elaborate on this? Moreover, since the position is the molecular geometry in 3D space, how is the equivariance w.r.t. the spatial transformation considered in the architecture?
Thank you for your inquiry regarding the implementation details. In response to your question about the use of RDKit for conformer sampling in our study, we indeed employ the ETKDG method to generate 3D conformations, followed by the MMFF94 force field for optimization. For molecules where the generation of a 3D conformation is unsuccessful, we fall back to producing a 2D conformation with a flat z-axis using RDKit's AllChem.Compute2DCoords function instead.
There are no additional filtering rules. Our preprocessing scripts are publicly available for review and can be accessed at the following GitHub repository: https://github.com/dptech-corp/Uni-Mol/blob/main/unimol_plus/scripts/get_3d_lmdb.py.
Regarding the 3D position prediction head within Uni-Mol+, we have adopted the 3D prediction head proposed in Graphormer-3D [1]. The architecture takes the atom representation x^L, the pair representation p^L, and the initial coordinates c as inputs. An attention mechanism is first applied, and the attention weights are then multiplied pointwise with the pairwise delta coordinates derived from the initial coordinates. In the attention mechanism, B^h_{i,j} is an attention bias term, A^h denotes the attention weights, and Δ(c)_{ij} is the delta coordinate between c_i and c_j, where the superscripts 0, 1 and 2 represent the X, Y and Z axes respectively. The position prediction head then predicts coordinate updates using three linear projections of the attention head values onto the three axes, where Δ(c′) is the predicted coordinate update and c′ is the predicted coordinates.
Regarding equivariance, the coordinate prediction head described above does not inherently enforce strict equivariance. This can be addressed through one of two strategies: (1) strict equivariance can be achieved by sharing the parameters across the three linear layers in Equation (2), denoted linear_1, linear_2, and linear_3, and eliminating the bias terms within these layers; (2) the model's robustness to spatial transformations can be enhanced by applying random rotations to the input coordinates as a form of data augmentation. We tested both techniques in our experiments. The latter approach, data augmentation via random rotations, yielded better accuracy in quantum chemistry property prediction and was therefore selected for our model. In this case, empirical evidence suggests that with a sufficiently large training dataset, such as PCQM4MV2, the model naturally tends towards an equivariant state. Specifically, we observe that the parameters of the three linear layers converge to nearly the same values and the bias terms approach zero, with discrepancies on the order of 1e-4.
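The equivariance argument can be checked numerically with a small sketch. The head here is an assumed simplification written from the verbal description only (attention weights multiplied with pairwise delta coordinates, then one linear projection per axis); it is not the authors' implementation. All parameter names (`Wq`, `Wk`, `Wv`, `Wb`, `W_out`, `b_out`) and shapes are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(rng, d=8, d_pair=4, heads=2, d_head=4):
    """Random parameters for the assumed head (illustrative shapes)."""
    return {
        "Wq": rng.normal(size=(heads, d, d_head)),
        "Wk": rng.normal(size=(heads, d, d_head)),
        "Wv": rng.normal(size=(heads, d, d_head)),
        "Wb": rng.normal(size=(d_pair, heads)),         # pair repr -> attention bias
        "W_out": rng.normal(size=(3, heads * d_head)),  # one linear layer per axis
        "b_out": rng.normal(size=3),
    }

def position_head(x, p, c, prm, shared=True):
    """x: (n, d) atom repr, p: (n, n, d_pair) pair repr, c: (n, 3) coordinates."""
    q = np.einsum("nd,hde->hne", x, prm["Wq"])
    k = np.einsum("nd,hde->hne", x, prm["Wk"])
    v = np.einsum("nd,hde->hne", x, prm["Wv"])
    bias = np.einsum("ijd,dh->hij", p, prm["Wb"])
    attn = softmax(np.einsum("hie,hje->hij", q, k) / np.sqrt(q.shape[-1]) + bias)
    delta = c[:, None, :] - c[None, :, :]               # pairwise delta coordinates
    # Weight delta coordinates per axis by attention, aggregate the head values.
    o = np.einsum("hij,ija,hje->aihe", attn, delta, v).reshape(3, c.shape[0], -1)
    if shared:   # strategy (1): shared weights across axes, no bias
        W, b = np.repeat(prm["W_out"][:1], 3, axis=0), np.zeros(3)
    else:        # three independent linear layers with bias
        W, b = prm["W_out"], prm["b_out"]
    return c + np.einsum("aif,af->ia", o, W) + b        # predicted coordinates

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q
```

With `shared=True`, rotating the input coordinates rotates the output identically, because the attention weights depend only on the invariant representations and the same projection is applied to each axis. With independent layers and biases this holds only approximately, and only once the layers have converged towards each other as described in the response.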
We have elaborated on these details in the main text (Section 2.2) and Supplementary Section 1. We appreciate your request for clarification and hope this response adequately addresses your concerns.

Response to Reviewer #2
We greatly appreciate the reviewer's constructive comments. Below, we provide a comprehensive response to each point raised.

Time Cost Regarding Molecular Sizes
▷ Review Comment: 1. The authors provide parameter comparison on multiple datasets, but the time cost is missing. In addition to the parameters, the number of iterations R is another variable that determines the overall computation cost. Since the goal of the studied problem is, in essence, to approximate DFT computation with affordable computation, it would be important to compare practical time cost among different approaches as well as DFT.
In addition, could the authors study the effect of increasing R on molecules grouped by different sizes?
Given the intuition that DFT computation cost grows significantly over molecule size, it's reasonable to assume that more computation is required for larger molecules, controlled by R in Uni-Mol+. Therefore it's important to make clear the scalability of Uni-Mol+.
We appreciate your valuable feedback. Addressing your concerns, we've conducted a detailed analysis of the computational time costs, particularly focusing on different molecular sizes. For this analysis, we chose a diverse set of molecules, grouped them by size (50 molecules per size group), and calculated the average time required to process a single molecule in each group.
We compared the time costs of Uni-Mol+ across different numbers of conformation update rounds (R) with those of traditional Density Functional Theory (DFT) calculations. Specifically, for Uni-Mol+, we assessed the computational time for molecules with up to 256 atoms. However, due to the substantial time demands of DFT calculations, we limited our DFT time cost analysis to molecules with a maximum of 50 atoms.
The computational evaluations for Uni-Mol+ were performed on a single NVIDIA V100 GPU, while the DFT calculations, including geometry optimization and quantum chemical energy computations, were conducted using Psi4 1.4.1 on 32 CPUs. The DFT calculations employed the B3LYP functional and 6-31G* basis set, aligning with the settings used in the PCQM4MV2 dataset.
We have presented these results in Fig. 1. The results clearly indicate that Uni-Mol+ not only provides a faster computational solution compared to DFT but also demonstrates superior scalability in relation to molecular size. Moreover, our findings reveal that the computational time cost associated with Uni-Mol+ increases linearly with the number of update rounds (R), affirming the predictability and efficiency of our method. We have elaborated on these discussions in Supplementary Section 3.
We've included an additional ablation study result in our paper for a more comprehensive comparison, referring to No.18 in the below Table 3 (and Table 3
comparing the RMSD between the refined conformation and the ground truth would be informative (beyond the selected samples' RMSD presented in Figure 2). Additionally, evaluating the performance of trained 3D GNN models, like SphereNet, in predicting QC properties from the optimized molecular geometries would be beneficial, considering the known accuracy of 3D GNN models in predicting QC properties from equilibrium conformations.
Thank you for your suggestion regarding the quantitative evaluation of our method's accuracy in refining initial conformations towards DFT equilibrium conformation. We acknowledge the importance of a precise quantitative assessment in validating the efficacy of our approach.
It's crucial to note that within the PCQM4MV2 benchmark, 'ground truth' conformations for validation or test sets are not readily available. Consequently, as explicitly mentioned in our manuscript, we resorted to generating DFT conformations independently. It is, however, imperative to recognize that these self-generated equilibrium conformations may not precisely mirror the authentic ground truth conformations intrinsic to the PCQM4MV2 dataset. This discrepancy inherently complicates the process of conducting an accurate quantitative evaluation of the optimized conformation's precision.
As a result, we initially limited our analysis to presenting a qualitative evaluation through selected examples.
Nonetheless, acknowledging the value of your suggestion, we ventured to extend our analysis to the OC20 benchmark, where ground-truth equilibrium conformations are accessible within the validation dataset. We have conducted additional evaluations and benchmarked our results against the current state-of-the-art model, EquiFormer. The results, as depicted in Table 4, clearly demonstrate that Uni-Mol+ surpasses the previous baseline in predicting equilibrium conformations. We have elaborated on these discussions in Supplementary Section 3.
Regarding your suggestion about employing trained 3D GNN models, like SphereNet, to assess performance based on the predicted equilibrium conformations, we believe this step might not be necessary for two main reasons. Firstly, Uni-Mol+ can predict QC properties based on the optimized conformations without the need for intermediary models. Secondly, using DFT calculations to derive QC properties from the optimized conformations ensures greater accuracy than 3D GNN model-based predictions. This has been supported in our manuscript, specifically in Figure 3. The results presented in Figure 3 indicate that the conformations optimized by Uni-Mol+ reach a lower energy state compared to their initial states, underscoring the proficiency of Uni-Mol+ in conformation optimization.
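The conformation-accuracy metric requested by the reviewer, RMSD between a predicted and a reference conformation, can be sketched as follows. The response does not state whether the reported comparison aligns structures first, so both variants are shown: a plain RMSD in a fixed frame, and an RMSD after optimal rigid alignment using the standard Kabsch algorithm. The function names are hypothetical.

```python
import numpy as np

def rmsd(a, b):
    """Plain RMSD between two (n_atoms, 3) arrays in the same frame."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

def kabsch_rmsd(pred, ref):
    """RMSD after removing the best rigid translation and rotation (Kabsch)."""
    P = np.asarray(pred, dtype=float)
    Q = np.asarray(ref, dtype=float)
    P = P - P.mean(axis=0)                      # remove translation
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)           # SVD of the 3x3 covariance matrix
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T     # proper rotation, no reflection
    return rmsd(P @ R.T, Q)
```

For molecules in vacuum the aligned variant is the usual choice, since a rigid motion does not change the conformation; for systems with a fixed reference frame the plain variant may be more appropriate.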

Recent Leaderboard Results
▷ Review Comment: 4. In the OGB leaderboard, a method named "EGT+Tri.Attn.+RDKitCoords." outperforms Uni-Mol+. I recommend that the authors include this method in Table 1 for a comprehensive comparison.
We sincerely appreciate the reviewer's suggestion to include the "EGT+Tri.1.This footnote explicitly mentions the dates when the leaderboard was accessed, thereby maintaining the relevance and fairness of the comparison within the evolving research environment.

Energy Difference Between Initial and Equilibrium Conformers
▷ Review Comment: 5. Figure 3 currently only shows the energy difference between initial and predicted conformers, which offers limited information. Providing the energy difference between initial and equilibrium conformers would more accurately reflect the model's performance in optimizing molecular geometries.
We appreciate the reviewer's suggestion to illustrate the energy difference between the initial and equilibrium conformers, offering a more comprehensive view of the model's performance in optimizing molecular geometries.
However, it's important to note that the PCQM4MV2 dataset's validation set does not provide a ground truth equilibrium conformation or its corresponding energy, which poses a challenge in presenting these results directly. In response to this, we have independently calculated the equilibrium conformation and its energy by DFT, and we present this additional information in the newly added Fig. 2. This figure illustrates that the energy difference distribution between the initial and predicted conformations closely aligns with that between the initial and equilibrium conformations. This similarity demonstrates Uni-Mol+'s effectiveness in predicting equilibrium conformations accurately.
We have revised Figure 3  to smaller molecules. Interestingly, our data indicates that increasing "R" does not significantly benefit larger molecules, as the performance at R=1 and R=2 is nearly identical.
Upon further examination, we recognized that the number of larger molecules in the validation set is quite small, as shown in Table 2. This led us to investigate the training data's molecular size distribution, detailed in Table 3. Most molecules in the training set are within the size range of (10, 20], which correlates with where we observe the lowest validation MAE. This indicates that the higher errors on larger molecules are due to the distribution of the training data. This could be further improved by a training dataset with a broader distribution of molecular sizes, or by models that generalize better across molecule sizes. We are very grateful for the reviewer's comment, which has helped us find this problem. We will leave it to future work. We have elaborated on these discussions in Supplementary Section 3.
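A per-size breakdown of this kind can be computed with a short sketch like the following. The bin edges, the function name `mae_by_size` and the toy numbers are illustrative assumptions; only the half-open interval convention (lower, upper], as in the (10, 20] range mentioned here, is taken from the response.

```python
import numpy as np

def mae_by_size(n_atoms, y_true, y_pred, edges):
    """Group molecules into (lower, upper] size bins; return {bin: (count, MAE)}."""
    n_atoms = np.asarray(n_atoms)
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    # right=True puts a value x into bin i when edges[i-1] < x <= edges[i].
    idx = np.digitize(n_atoms, edges, right=True)
    out = {}
    for b in range(1, len(edges)):
        mask = idx == b
        if mask.any():
            out[(edges[b - 1], edges[b])] = (int(mask.sum()), float(err[mask].mean()))
    return out

# Toy data only, to show the output layout.
stats = mae_by_size([5, 10, 15, 20, 25], [0, 0, 0, 0, 0], [1, 3, 2, 4, 6], [0, 10, 20, 30])
```

Reporting the count next to each MAE makes it visible when a bin, such as the largest molecules, contains too few samples for its error to be reliable.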

Comparison with Evoformer of AlphaFold2
▷ Review Comment: Better contextualize the proposed model: Since the backbone model of Uni-Mol+ has heavily overlapped with the Evoformer of AlphaFold2 (AF2) [1], it is better to cite AF2 properly in the main text (Section 2 & 4) to provide appropriate context for common readers. However, I only found the reference of AF2 in the supplementary material in the present version. How do the authors conduct the model architecture design and compare it to AF2/Evoformer? To be specific, given the current architecture is based on AF2/Evoformer, what are the differences between these two? If there exists any architecture deviation, what is the rationale for such model design?

Training costs and comparison with pretraining
▷ Review Comment: By comparing the Uni-Mol and Uni-Mol+ (this paper), the Uni-Mol involved the 3D position predictions task as the pre-training objectives while Uni-Mol+ merged it and the property prediction task into one single supervised training. Since one of the biggest merits of pre-training is to amortize the expensive training overhead and make finetuning accessible on downstream tasks, I am curious: what is the estimated training time for Uni-Mol+ for each task?
The mixture used to sample q combines a Bernoulli distribution and a Uniform distribution. The Bernoulli distribution (1) addresses the distributional shift between training and inference and (2) enhances the learning of an accurate mapping from the equilibrium conformation to the QC properties. Meanwhile, the Uniform distribution generates additional intermediate states to serve as model inputs, effectively augmenting the input conformations.

Figure 1: Time cost for different molecular sizes. (a) DFT versus Uni-Mol+ (with varying update rounds R); (b) Effect of different update rounds R in Uni-Mol+.

Figure 2: Distribution of delta energy. We selected 100 data points and used DFT to calculate the following values: (a) the delta energies between their initial and Uni-Mol+'s predicted conformations; (b) the delta energies between their initial conformations and the DFT conformations, where the DFT conformations are calculated by ourselves using a DFT tool. Cross-marks indicate data points with increased energies, while circle-marks denote those with decreased energies.

We have included two additional ablation results in the paper for a more comprehensive comparison. They appear as entries No. 18 and No. 19 in Table 1 below (and in Table 3 of the revised manuscript). First, comparing No. 18 with No. 1 shows that Noisy Nodes (No. 18, result 0.0760) performs significantly worse than Uni-Mol+ (No. 1, result 0.0696). The only difference between these two entries is the training strategy; the model structure is the same. This large performance gap (0.0760 vs. 0.0696) highlights the importance of our proposed training strategy. As explained in Section 3, our training approach is effective because it additionally uses a Bernoulli distribution, which helps handle the distributional shift and improves the prediction of quantum chemistry (QC) properties. Second, comparing No. 19 with No. 18 shows that using Uni-Mol's backbone yields slightly worse results than using Uni-Mol+'s backbone, although this difference is small. These additional ablation studies confirm that Uni-Mol+ substantially outperforms the combination of Uni-Mol and Noisy Nodes, and that the key factor behind this improvement is our proposed training strategy.

Table 1: Comparison with Noisy Nodes and Uni-Mol.
▷ Review Comment: As the authors put, "our work emphasizes a novel paradigm for QC property prediction, rather than developing a new model backbone." How do the authors compare the Uni-Mol+ with the Uni-Mol [2] (3D position recovery) + Noisy Node [3] (interpolation strategy)? Though the original Uni-Mol pretraining does not involve energy prediction as a self-supervised task, one can easily append a property prediction head on top of its architecture to do supervised pre-training. Could the author offer more insights beyond such a combination? Generally speaking, the proposed method itself reveals few insights to the research community and thus lacks enough significance. The task is well-studied while the model architecture / training objectives lack novelty.

▷ Review Comment: Since the Uni-Mol+ requires RDKit to generate the initial 3D conformation as input which is different from previous baselines, could the author provide an ablation study demonstrating the significance of such a module? To be specific, how is the performance of Uni-Mol+ when the starting 3D conformation is noisy or not as good as RDKit? Is the model showing consistency or robustness for the evaluation results by switching different conformer sampling modules, including but not limited to the Openbabel mentioned in the paper?

Table 2: The benchmark results on PCQM4MV2, with different initial conformations.

Could the authors provide the benchmark results on the QM9 to make a fair comparison with the baseline models such as Transformer-M?

in the revised manuscript).

Table 3: Comparison with Noisy Nodes.

Quantitative Analysis for the Optimized Conformations
▷ Review Comment: 3. The authors claim that their method refines initial conformations towards DFT equilibrium conformation. I suggest a quantitative evaluation of the optimized conformation's accuracy. For instance,

Table 4: RMSD for predicted conformations on OC20 valid set.
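As a reference for the metric named in this caption, the following is a minimal sketch of a root-mean-square deviation between a predicted and a reference conformation. How the reported values were actually computed is not specified here, so the choices below are assumptions: the function name `rmsd` is hypothetical, the atoms are assumed to be in the same order in both arrays, and no structural alignment or periodic-boundary handling is applied.

```python
import numpy as np


def rmsd(pred, ref):
    """Root-mean-square deviation between two (n_atoms, 3) coordinate arrays.

    Assumes identical atom ordering, no alignment, and no periodic wrapping.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape:
        raise ValueError("pred and ref must have the same shape")
    # Squared displacement per atom, averaged over atoms, then square-rooted.
    sq_dist = np.sum((pred - ref) ** 2, axis=1)
    return float(np.sqrt(np.mean(sq_dist)))


# Usage: every atom displaced by 1 along x gives an RMSD of 1.
ref = np.zeros((4, 3))
pred = ref + np.array([1.0, 0.0, 0.0])
value = rmsd(pred, ref)
```

For periodic systems, a minimum-image correction of the displacements would normally be needed before applying this formula; that step is omitted from the sketch.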
Attn.+RDKit Coords." method in Table 1 for a comprehensive comparison. It's important to note that "EGT+Tri.Attn.+RDKitCoords." was first released in November 2023, approximately 8 months after the initial release of Uni-Mol+ in March 2023. Besides, "EGT+Tri.Attn.+RDKitCoords." was significantly influenced by the pioneering Uni-Mol+, incorporating approaches such as RDKit-generated conformations and triangular operators, and also references Uni-Mol+'s paper in its publication. Furthermore, at the time of submitting this manuscript in October 2023, "EGT+Tri.Attn.+RDKitCoords." had not yet featured in the OGB leaderboard. Given these considerations, directly comparing "EGT+Tri.Attn.+RDKitCoords." with Uni-Mol+ in Table 1 might not accurately represent the state of research at the time of Uni-Mol+'s submission. To ensure clarity and provide context regarding the timeline of the works cited, we have added a footnote in Table

Table 2: The prediction error of increasing R on molecules grouped by different sizes.

Table 3: Data distribution of the training set.